Learning classifiers from skewed or imbalanced datasets frequently leads to poor classification performance, and this is a serious problem. In such datasets one class contains the majority of the examples, while the other, which is often the more important class, is represented by only a small proportion of examples. Training on this kind of data can render many otherwise carefully designed machine-learning systems ineffective: classifiers tend to achieve high training accuracy on the majority class while remaining biased against instances of the minority class. Most remedies for this issue therefore aim to improve learning from the minority class. This article examines the most widely used methods for addressing the class-imbalance problem, including data-level, algorithm-level, hybrid, cost-sensitive, and deep-learning approaches, together with their advantages and limitations. The efficiency and performance of the resulting classifiers are assessed using a range of evaluation metrics.
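As a concrete illustration of two of the ideas this abstract surveys, the following is a minimal sketch (not from the article) of an algorithm-level remedy, class weighting, evaluated with imbalance-aware metrics in scikit-learn; the synthetic dataset and model choice are assumptions for illustration only.

```python
# A minimal sketch of class-weighted training plus imbalance-aware evaluation
# metrics using scikit-learn. Dataset and model choices are illustrative only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score, classification_report, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic dataset where the positive (minority) class is only ~5% of the samples.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Algorithm-level remedy: reweight classes inversely to their frequency.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)

# Plain accuracy is misleading here; report imbalance-aware metrics instead.
y_pred = clf.predict(X_te)
print(classification_report(y_te, y_pred, digits=3))            # per-class precision/recall/F1
print("balanced accuracy:", balanced_accuracy_score(y_te, y_pred))
print("ROC AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```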
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
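Since the checkpoints are publicly released, one way to try them is through the Hugging Face transformers library; the snippet below is an illustrative sketch rather than the authors' reference code, and uses the smaller released bloom-560m variant to keep memory requirements modest.

```python
# Illustrative sketch of loading a released BLOOM checkpoint with Hugging Face
# transformers; the smaller bloom-560m variant is used here instead of the full 176B model.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

inputs = tokenizer("BLOOM is an open-access multilingual language model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```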
Matching user search queries to the keywords on which advertisers bid in real time is a crucial problem in sponsored search. In the literature, two broad families of approaches have been explored for this problem: (i) dense retrieval (DR), which learns representations of queries and bid keywords in a shared space, and (ii) natural language generation (NLG), which learns to directly generate bid keywords for a given query. In this work, we first conduct an empirical study of these two approaches and show that they offer additive, complementary benefits. In particular, a large fraction of the keywords retrieved by NLG are not retrieved by DR, and vice versa. We then show that it is possible to effectively combine the strengths of both approaches in a single model. Specifically, we propose HEARTS, a novel multi-task fusion framework in which a shared encoder is jointly optimized to perform DR and non-autoregressive NLG simultaneously. Through extensive experiments on over 30 search query sets spanning more than 20 languages, we show that HEARTS retrieves 40.3% more high-quality bid keywords than baseline approaches using the same GPU compute. We also show that inference on a single HEARTS model matches inference on two separate DR and NLG baseline models at twice the compute. Furthermore, we show that a DR model trained with the HEARTS objective performs substantially better than one trained with the standard contrastive loss. Finally, we show that our HEARTS objective can be applied to short-text retrieval tasks beyond sponsored search, yielding significant performance gains.
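To make the fusion idea concrete, here is a hypothetical PyTorch sketch of a shared encoder with two heads, a dense-retrieval embedding head and a non-autoregressive generation head that emits all keyword tokens in parallel; the module names, dimensions, pooling, and losses are assumptions, not the paper's implementation.

```python
# Hypothetical sketch: one shared Transformer encoder jointly serving a dense-retrieval
# head and a non-autoregressive generation head. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SharedEncoderDRNLG(nn.Module):
    def __init__(self, vocab_size=30522, d_model=256, max_keyword_len=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)   # shared encoder
        self.retrieval_head = nn.Linear(d_model, d_model)           # DR embedding
        self.generation_head = nn.Linear(d_model, vocab_size)       # per-position token logits
        self.max_keyword_len = max_keyword_len

    def forward(self, query_ids):
        h = self.encoder(self.embed(query_ids))                     # (B, L, d)
        pooled = h.mean(dim=1)                                       # simple mean pooling
        dr_embedding = nn.functional.normalize(self.retrieval_head(pooled), dim=-1)
        # Non-autoregressive head: logits for a fixed number of keyword positions at once.
        nlg_logits = self.generation_head(h[:, : self.max_keyword_len, :])
        return dr_embedding, nlg_logits

model = SharedEncoderDRNLG()
dr_emb, nlg_logits = model(torch.randint(0, 30522, (2, 16)))
print(dr_emb.shape, nlg_logits.shape)  # torch.Size([2, 256]) torch.Size([2, 8, 30522])
```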
Traditional survey methods for determining surface resistivity are time-consuming and labor-intensive, and few studies have focused on estimating resistivity/conductivity from remote sensing data with deep learning techniques. In this work, we evaluate the correlation between surface resistivity and synthetic aperture radar (SAR) data by applying various deep learning methods, and we test our hypothesis in the Coso geothermal area, USA. To detect resistivity, L-band full-polarimetric SAR data acquired by UAVSAR are used, with MT (magnetotelluric) inverted resistivity data serving as ground truth. We conduct experiments comparing various deep learning architectures and propose a Dual-Input UNet (DI-UNet) architecture. DI-UNet predicts resistivity from full-polarimetric SAR data with a deep learning architecture and promises much faster surveys than traditional methods. Our proposed method succeeds in mapping MT resistivity from SAR data.
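The sketch below illustrates the general shape of a dual-input encoder-decoder in the spirit of DI-UNet: two input branches (e.g. different SAR-derived channel stacks) are encoded separately, fused, and decoded to a resistivity map. Channel counts, depth, and the split of the inputs are assumptions, not the authors' architecture.

```python
# Hypothetical dual-input encoder-decoder sketch: separate encoders, feature fusion,
# and a decoder producing a single-channel resistivity map. All sizes are assumptions.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class DualInputUNetSketch(nn.Module):
    def __init__(self, in_ch_a=4, in_ch_b=4):
        super().__init__()
        self.enc_a = conv_block(in_ch_a, 16)
        self.enc_b = conv_block(in_ch_b, 16)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(32, 64)          # fused features from both branches
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = conv_block(32 + 32, 32)            # skip connection from fused encoder maps
        self.out = nn.Conv2d(32, 1, 1)                # single-channel resistivity map

    def forward(self, x_a, x_b):
        fa, fb = self.enc_a(x_a), self.enc_b(x_b)
        fused = torch.cat([fa, fb], dim=1)            # (B, 32, H, W)
        bottom = self.bottleneck(self.pool(fused))    # (B, 64, H/2, W/2)
        up = self.up(bottom)                          # (B, 32, H, W)
        return self.out(self.dec(torch.cat([up, fused], dim=1)))

model = DualInputUNetSketch()
pred = model(torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64))
print(pred.shape)  # torch.Size([1, 1, 64, 64])
```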
With the continued development of social software and multimedia technology, images have become an important carrier for spreading information and socializing, and how to evaluate images comprehensively has become a focus of recent research. Traditional image aesthetics assessment methods usually adopt a single numerical overall score, which carries a degree of subjectivity and can no longer satisfy higher aesthetic requirements. In this paper, we construct a new image attribute dataset, the Aesthetic Mixed Dataset with Attributes (AMD-A), and design fused external attribute features. Furthermore, we propose an efficient method for image aesthetic attribute assessment on this mixed multi-attribute dataset and build a multi-task network architecture using EfficientNet-B0 as the backbone. Our model can perform aesthetic classification, overall scoring, and attribute scoring. In each sub-network, we improve feature extraction with an ECA channel attention module. For the final overall score, we adopt the idea of a teacher-student network and use the classification sub-network to guide fine-grained regression of overall aesthetics. Experimental results, obtained using MindSpore, show that our proposed method can effectively improve the performance of both overall and attribute aesthetic assessment.
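For reference, the block below is a sketch of an ECA-style channel attention module (global average pooling, a 1D convolution across channels, and a sigmoid gate), which is the kind of block the abstract says is inserted into each sub-network; the kernel size and placement are assumptions rather than the paper's exact configuration.

```python
# Sketch of an ECA-style channel attention block: global average pooling followed by
# a 1D convolution across channels and a sigmoid gate that rescales each channel.
# Kernel size and placement within each sub-network are illustrative assumptions.
import torch
import torch.nn as nn

class ECAAttention(nn.Module):
    def __init__(self, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                        # x: (B, C, H, W)
        y = x.mean(dim=(2, 3))                   # global average pool -> (B, C)
        y = self.conv(y.unsqueeze(1))            # 1D conv across channels -> (B, 1, C)
        w = self.sigmoid(y).squeeze(1)           # per-channel weights in [0, 1]
        return x * w.unsqueeze(-1).unsqueeze(-1)

feat = torch.randn(2, 1280, 7, 7)                # e.g. an EfficientNet-B0 final feature map
print(ECAAttention()(feat).shape)                # torch.Size([2, 1280, 7, 7])
```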
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%), and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based; of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of participants, and only 50% performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
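As context for the most common workaround reported above, the following is a minimal sketch of patch-based training, i.e. sampling random crops from samples that are too large to process at once; the patch size and the pure-NumPy sampling are assumptions for illustration.

```python
# Minimal sketch of the patch-based strategy most respondents used for oversized
# samples: train on random aligned crops instead of whole images/volumes.
import numpy as np

def random_patch(image, label, patch_size=(64, 64)):
    """Sample one aligned random patch from an image/label pair of shape (H, W[, C])."""
    h, w = image.shape[:2]
    ph, pw = patch_size
    top = np.random.randint(0, h - ph + 1)
    left = np.random.randint(0, w - pw + 1)
    return (image[top:top + ph, left:left + pw],
            label[top:top + ph, left:left + pw])

image = np.random.rand(1024, 1024, 3)       # a sample far larger than the network input
mask = np.random.randint(0, 2, (1024, 1024))
img_patch, mask_patch = random_patch(image, mask)
print(img_patch.shape, mask_patch.shape)     # (64, 64, 3) (64, 64)
```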
The core of the computer business now offers subscription-based on-demand services with the help of cloud computing. Resources can be shared among multiple users through virtualization, which creates a virtual instance of a computer system running in an abstracted hardware layer. In contrast to early distributed computing models, cloud computing offers virtually unlimited computing capacity through its massive datacenters, and it has become enormously popular in recent years owing to its continually growing infrastructure, user base, and hosted data volume. This article proposes a conceptual framework for a workload management paradigm in cloud settings that is both secure and performance-efficient. In this paradigm, a resource management unit performs energy- and performance-efficient virtual machine allocation, ensures the safe execution of users' applications, and protects against data breaches caused by unauthorised virtual machine access in real time. A secure virtual machine management unit controls the resource management unit and is designed to report unlawful access or intercommunication. Additionally, a workload analyzer unit runs concurrently to estimate resource consumption, helping the resource management unit allocate virtual machines more effectively. The proposed model employs several mechanisms toward the same objective, including data encryption and decryption prior to transfer and a trust-based access mechanism that prevents unauthorised access to virtual machines, which introduces additional computational overhead.
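To illustrate the "encrypt before transfer, decrypt after" step that contributes to the computational overhead mentioned above, here is a minimal sketch using the `cryptography` package's Fernet recipe; the key handling and the transfer itself are outside the scope of this sketch and are not part of the article's framework.

```python
# Illustrative sketch of symmetric, authenticated encryption of workload data prior to
# transfer, and decryption on the receiving virtual machine, via cryptography's Fernet.
from cryptography.fernet import Fernet

key = Fernet.generate_key()                 # in practice, managed by the trust/access mechanism
cipher = Fernet(key)

workload_data = b"user application payload destined for a virtual machine"
encrypted = cipher.encrypt(workload_data)   # sent over the network in this form
decrypted = cipher.decrypt(encrypted)       # recovered only inside the authorised VM

assert decrypted == workload_data
print(len(workload_data), "->", len(encrypted), "bytes (ciphertext overhead)")
```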
Multi-Task Learning (MTL) has demonstrated its value in user-facing products through faster training, better data efficiency, reduced overfitting, etc. MTL achieves this by sharing network parameters and training a single network for multiple tasks simultaneously. However, MTL does not provide a solution when each task needs to be trained on a different dataset. To address this problem, we propose an architecture named TreeDNN along with its training methodology. TreeDNN allows the model to be trained on multiple datasets simultaneously, where each branch of the tree may require a different training dataset. Our results show that TreeDNN provides competitive performance with the advantage of a reduced ROM requirement for parameter storage and increased system responsiveness, since only the specific branch needed is loaded at inference time.
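The sketch below shows the general idea of a tree-structured network, a shared trunk feeding per-task branches, where each branch can be trained on its own dataset and persisted or loaded independently at inference time; the layer sizes, the two example tasks, and the saving scheme are assumptions, not the paper's exact TreeDNN.

```python
# Hypothetical sketch of a shared trunk with per-task branches. Only the requested
# branch is executed at inference, and trunk/branch weights can be stored separately.
import torch
import torch.nn as nn

class TreeBranchNet(nn.Module):
    def __init__(self, in_dim=128, trunk_dim=64, branch_out=None):
        super().__init__()
        branch_out = branch_out or {"task_a": 10, "task_b": 5}      # illustrative tasks
        self.trunk = nn.Sequential(nn.Linear(in_dim, trunk_dim), nn.ReLU())  # shared parameters
        self.branches = nn.ModuleDict(
            {name: nn.Linear(trunk_dim, out) for name, out in branch_out.items()}
        )

    def forward(self, x, task):
        # Only the requested branch runs; other branches need not even be loaded.
        return self.branches[task](self.trunk(x))

model = TreeBranchNet()
x = torch.randn(4, 128)
print(model(x, "task_a").shape, model(x, "task_b").shape)  # [4, 10] [4, 5]

# To reduce on-device storage, the trunk and each branch can be saved as separate files.
torch.save(model.trunk.state_dict(), "trunk.pt")
torch.save(model.branches["task_a"].state_dict(), "branch_task_a.pt")
```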
Determining accurate bird's eye view (BEV) positions of objects and tracks in a scene is vital for various perception tasks, including object interaction mapping and scenario extraction; however, the level of supervision required to accomplish this is extremely difficult to procure. We propose a lightweight, weakly supervised method to estimate the 3D position of objects by jointly learning to regress 2D object detections and the scene's depth prediction in a single feed-forward pass of a network. Our proposed method extends a center-point based single-shot object detector \cite{zhou2019objects} and introduces a novel object representation in which each object is modeled spatio-temporally as a BEV point, without the need for any 3D or BEV annotations during training or LiDAR data at query time. The approach leverages readily available 2D object supervision along with LiDAR point clouds (used only during training) to jointly train a single network that learns to predict 2D object detections alongside the whole scene's depth, in order to spatio-temporally model object tracks as points in BEV. The proposed method is over $\sim$10x more computationally efficient than recent SOTA approaches [1, 38] while achieving comparable accuracy on the KITTI tracking benchmark.
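A minimal sketch of the geometric step underlying such a representation is shown below: a detected 2D object centre plus a predicted depth is back-projected through the camera intrinsics and reduced to a BEV point. The intrinsics and values are made up for illustration and are not taken from the paper.

```python
# Illustrative lifting step: back-project a 2D detection centre (u, v) at predicted
# depth d through the camera intrinsics K, keeping the ground-plane (x, z) as the BEV point.
import numpy as np

K = np.array([[721.5, 0.0, 609.6],      # assumed pinhole intrinsics (fx, fy, cx, cy),
              [0.0, 721.5, 172.9],      # loosely KITTI-like, for illustration only
              [0.0, 0.0, 1.0]])

def center_depth_to_bev(u, v, depth, K):
    """Back-project a pixel at a predicted depth into camera coordinates; return (x, z)."""
    xyz = depth * np.linalg.inv(K) @ np.array([u, v, 1.0])
    return xyz[0], xyz[2]               # lateral offset and forward range on the BEV plane

x, z = center_depth_to_bev(u=700.0, v=200.0, depth=15.0, K=K)
print(f"BEV point: x={x:.2f} m, z={z:.2f} m")
```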